Monitoring Deforestation In Landsat Data Cubes Using NDVI
The deforestation of the rainforest is a important topic in science and public. Especially due to the worsening of climate change, there are more and more calls for countries that contain large parts of the rainforest to stop deforestation. This is especially true for Brazil, which has the largest amount of tropical rainforest in the world, covering an area of 318,7 ha (statista). For a long time it seemed that Brazil could counteract deforestation. At the beginning of the 2010s, Brazil managed to reduce deforestation of the rainforest by 84%, compared to the historic peak of 2004. Thus, in 2012, only 4,571 km2 were cut down instead of 27,772 km2 in 2004 (nature ecology & evolution: The Brazilian Amazon deforestation rate in 2020 is the greatest of the decade). However, this trend has reversed since 2013. The Brazilian Amazon Deforestation Monitoring Program (PRODES) estimated deforestation of 11,088 km2 for 2020, which corresponds to the highest value for the entire decade. This upward trend in deforestation was triggered by changes in the Brazilian Forest Code and new laws which may allow illegally grabbed public land to be legalized (Clarifying Amazonia’s burning crisis). Thus, illegal clearing of the rainforest, which is mostly difficult to record, occurs over and over again.
With this work we present a tool to capture deforestation of the Brazilian rainforest in near real time by analyzing the NDVI in Landsat Data Cubes. Using previously collected reference data for a study area in the Brazilian Amazon, we calculate the changes in NDVI after deforestation in order to use these values for future detection of deforestation. By comparing the NDVI change of satellite images from two timesteps, we analyze deforestation. A land is identified as deforested if the NDVI decrease is over a determined threshold. To make the satellite images usable, clouds were removed and thus missing pixel values were replaced by the observated values of future timestamps. In addition seasonal variations where removed, which would otherwise have disguised erroneously deforestation in the data. The methodology was largely taken from the paper Monitoring Deforestation at Sub-Annual Scales as Extreme Events in Landsat Data Cubes, leaving out some components that were not interesting for us and adding others that we needed, e.g. for the near real time analysis. In the following, we specify our used data, briefly describe the used methods as well as results and discuss them afterwards.
To perform our calculations, we choose a study area located in the northwest of Brazil. This area is located along the border of the Brazilian states Randônia, Amazonas and bordering the Bolivian state of Pando (compare Figure TODO). We choose this study area because it is completely covered by the area of the native rainforest and is also affected by deforestation. The extent was chosen to be larger than the deforestation area itself, to avoid a smoothing out effect when deseasonalising the satellite data. To carry out our methods, two types of data were needed. On the one hand satellite data to calculate the NDVI and on the other hand reference data when and where specifc areas were recognized as deforested. We need both data to inspect the NDVI change of an deforested area. The satellite data we used are Landsat 8 data in GeoTIFF format in the CRS WGS 84 / UTM zone 19N, with a spatial extent of [left = -7347259, right = -7314864, top = -995476.1, bottom = -1025490] and a temporal extent 2014-01-01 until 2019-12-31. The raw data is available here. In addition to the data needed to calculate the NDVI, data was also needed where deforestation had already been detected. We obtained these data from the PRODES program, already mentioned in the introduction. They publish data for the deforested areas in Brazil larger than 6.25 hectares starting from 2008. Wherby they considers deforestation to be the suppression of native vegetation, regardless of the future use of these areas. We have downloaded the corresponding data as shapefiles (TerraBrasilis catalogue).
As already briefly mentioned in the introduction, the basic idea of our approach comes from the paper Monitoring Deforestation at Sub-Annual Scales as Extreme Events in Landsat Data Cubes. Their basic idea was to divide an NDVI image time series into a reference cube and a monitoring cube. The reference cube contains historical observations where non-forest pixels have been masked, and the monitoring cube contains newly acquired observations, not yet assessed for deforestation. Based on training data and observations in the reference cube a threshold percentile for defining an observation as abormally low was calculated and applied for each pixel value in the monitoring cube. If the pixel is below the threshold percentile, the pixel is flagged as deforested or potential deforested. We adopted this basic idea and also carried out usual preparatory steps, such as a filling of cloud gaps and a deseasonalization of the data.
First, both the reference data and satellite images must be converted into a format that we can use. To start with the reference data, we loaded the data from PRODES into QGIS to prepare it accordingly. We changed the coordinate reference system from SIRGAS 2000 to EPSG:3857 - WGS 84/ Pseudo-Mercator as our satellite data were already in this format. Since the data were always available for individual days, we have combined them into months. As can be seen here, the data were ultimately available in this monthly format for four to 9 months, and predominantly in the second half of the year. The PRODES data are shapefiles, and were thus converted into the raster format which is also used for our satellite data. The last step was to crop the data to the study area described above. Similar steps that were performed for the PRODES data in QGIS, were performed for the satellite data in R using gdalcubes.
We split the images in a reference time series with a temporal extent from 2014 to 2018 and a monitoring time series with a temporal extent for 2019. Therefore we create gdal cubes. With the help of a mask, only those pixels are considered that are either clear or water in order to avoid calculation difficulties caused by clouds. For the reference time series, two data cubes are created, one with a spatial resolution of 250 m for the calculation of the NDVI and one with a spatial resolution of 1000 m for the deseasonalization of the data. This lower resolution in general but especially for the deseasonalization was necessary due to lack of computer power. With the creation of the data cubes, the spatial extent is set to the study area, the temporal extent is limited to the above mentioned dates, whereby the size of of pixels in time-direction is about one month. Then the NDVI is calculated for the data cubes. Using a mask to consider only cloud-free pixels inevitably results in pixels with not available values (na-values) that are not of the two selected classes for the mask. In order to still be able to perform our methods, these na-values must be filled. To do this, we fill these pixels with the next pixel that is available of the next observation carried backward in time. We have deliberately chosen to fill with pixels that are in future time steps, as otherwise the deforestation could be temporarily obscured. In the following the NDVI images are spatially normalised to reduce saisonality. Based on the justification from the paper Monitoring Deforestation at Sub-Annual Scales as Extreme Events in Landsat Data Cubes we divide for each time step each pixel value by the 0.95 NDVI quantile of the whole time step.
## Loading required package: sp
## Loading required package: terra
## terra version 1.1.4
## Loading required package: lattice
## Loading required package: latticeExtra
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded ellps WGS 84 in Proj4 definition: +proj=merc +a=6378137
## +b=6378137 +lat_ts=0 +lon_0=0 +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null
## +wktext +no_defs +type=crs
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum World Geodetic System 1984 in Proj4 definition
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded ellps WGS 84 in Proj4 definition: +proj=merc +a=6378137
## +b=6378137 +lat_ts=0 +lon_0=0 +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null
## +wktext +no_defs +type=crs
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum World Geodetic System 1984 in Proj4 definition
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded ellps WGS 84 in Proj4 definition: +proj=merc +a=6378137
## +b=6378137 +lat_ts=0 +lon_0=0 +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null
## +wktext +no_defs +type=crs
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum World Geodetic System 1984 in Proj4 definition
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded ellps WGS 84 in Proj4 definition: +proj=merc +a=6378137
## +b=6378137 +lat_ts=0 +lon_0=0 +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null
## +wktext +no_defs +type=crs
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum World Geodetic System 1984 in Proj4 definition
With the previous mentioned steps, the satellite data were adequately prepared to examine how the NDVI had changed in relation to a deforestation event. For this purpose, we calculat for our reference time series the NDVI before and after a deforestation event. For each month in which a deforestation event was detected on the basis of the PRODES data, the NDVI of the previous month and the following month are examined and substracted from each other. We have decided to use the previous month and the following month because we have prepared all data in a monthly format and thus it is the best possible comparison in terms of accuracy. For all time steps, and thus for each pixel that was affected by deforestation, we store the difference of the NDVI for this pixel from the previous month and the following month. This gives us a list of critical NDVI changes that potentially imply a deforestation event.
The list of critical values is then used to search for deforestation in the monitoring time series. Like in the paper Monitoring Deforestation at Sub-Annual Scales as Extreme Events in Landsat Data Cubes, we also use the extreme value approach. With this approach, deforestation observations, that we have collected in the previous mentioned list are regarded as abnormally high, if they are above a specific percentile. Thus, if the difference between two pixels of two subsequent time steps in the monitoring cube is above a certain percentile of the values of our list, the pixel is considered as deforested. Our script is designed in such a way that the specific percentile can be changed and thus the results of the potentially deforested area change accordingly. To make this changeability of our methodology easily applicable and visible to everyone, we are making this research available as an ERC. The ERC TODOTODO is available here and can be changed via the MANIPULATE menu item. In this way, the percentile at which a NDVI change is considered deforested can be changed interactively as binding and a map showing the deforested areas will change accordingly. The example given in the ERC also includes a table for evaluation. For the monitoring cube used by us, it is checked whether the detected deforestation has really taken place, i.e. whether it has also been detected by PRODES. For this purpose, the difference for the NDVI in one month and the second following month was again calculated in the monitoring cube. If a deforestation event was reported by PRODES in the intervening month, the detection on basis of the NDVI is considered correct. The table thus interactively displays the percentage of correctly recognized and incorrectly recognized values in relation to the currently choosen percentile.
With our work we archived different Results. First we calculated how the NDVI change of a deforested region from 2014 to 2018. With this data we can calculate a percentile, at which we th
load("data/NDVIchanges.Rdata")
hist(changes)
- über false postive… berichten - was passiert wenn man das quantile verändert